-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return EOF and ErrUnexpectedEOF correctly #64
Conversation
c098cce
to
6eb0095
Compare
I pushed a small fix to make link scanning a bit faster, but it's still pretty terrible. Also, I think we need to update "has read once" in all the "read a byte" methods, which will make performance worse. You can test performance with |
The ScanForLinks method is pretty small so we could avoid allocation there like in my latest commit. |
Yeah, the performance seems reasonable now. It's slightly slower, but not by much. |
Also regenerate tests, that's how I found this.
Hm. Actually, it's about 2x slower on decode and has 3x the number of allocations. I needed to regenerate the tests to find that. |
So, I've done what I can, but it's still much slower. Remaining things are:
I recommend testing with: go test -bench . -benchmem -memprofile=mem.out -cpuprofile=cpu.out Then look at the allocation counts and cpu profile with pprof. |
oh I see my mistake, I didn't realize I needed to regen the |
https://gist.github.com/MarcoPolo/b72fdd0e43e04d0721a5056b4f579190 Now we're at fewer allocs than before after the GetPeeker change, but still 8 more than before. |
ah I see what's happening. We pass in the reader from GetPeeker to the nested calls to UnmarshalCbor. And that isn't a ReaderWithEOFContext so we allocate a new one. (Also I think my last commit subtlety breaks this since it essentially unwraps the Reader from ReaderWithEOFContext and dropping the context). I wonder if doing a defer at the top level would also solve this without requiring us to wrap the reader. I'll try this tomorrow. Thanks for the tips :) |
f178403
to
22488b2
Compare
Alright I'm using defer now to remove the extra allocation from the wrapped reader. My understanding of simpler defer func statements is that they are inlined by the compiler (please correct me if I'm wrong). I updated the benchmark results here: https://gist.github.com/MarcoPolo/b72fdd0e43e04d0721a5056b4f579190 So no extra allocation compared to master, but a smidge slower. Also I think this needs a more careful review since it now relies on us doing this at the first read in Unmarshal/scanforlinks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually seems quite a bit cleaner.
From my benchmarks, any perf difference is noise. But we should get a third-party review from someone not involved. |
Let me add a test case here too |
(test currently fails, I'll investigate in a bit) |
Ah... we're doing error wrapping internally. |
yup, I switched to using errors.Is. Which does make me a bit uneasy since it needs to traverse the error chain to see if an error matches. The benchmarks don't show much though: https://gist.github.com/MarcoPolo/b72fdd0e43e04d0721a5056b4f579190 |
gen.go
Outdated
if err == io.EOF { | ||
err = io.ErrUnexpectedEOF | ||
if errors.Is(err, io.EOF) { | ||
err = xerrors.Errorf("%w: %v", io.ErrUnexpectedEOF, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't quite work. %w
needs to be last.
It's probably fine to just leave it as-is? The only thing we really care about is not returning an EOF if we've read bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it? https://go.dev/play/p/k2wOyoCk4WO makes it seem like both work.
I figured we cared enough to wrap the original error, so might be nice to keep that info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, being last only matters for Unwrap
. (docs: https://pkg.go.dev/golang.org/x/xerrors#Errorf)
I figured we cared enough to wrap the original error, so might be nice to keep that info.
My point was: that we could just leave everything as-is. I.e., we don't want an EOF unless we read no data, but we're fine with any other error (including a wrapped EOF) otherwise.
But if this works and doesn't hurt performance too much, then it's fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, being last only matters for Unwrap. (docs: https://pkg.go.dev/golang.org/x/xerrors#Errorf)
From that link
If the format specifier includes a %w verb with an error operand in a position other than at the end, the returned error will still implement an Unwrap method returning the operand, but the error's Format method will not return the wrapped error.
What is the "error's Format method" ?
anyways, I understand what you mean by as-is. I changed this to return the wrapped error and only replace if we return EOF. I also changed the test to only check that we aren't returning EOF exactly if it could read some bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the "error's Format method" ?
🤷♂️ I'm having a lot of difficulty with those docs.
Friendly bump on this @Stebalien :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Context
Fixes #62. See that issue for context.
Proposed solution
At first I considered trying to find all the spots after our first read and return
ErrUnexpectedEOF
, but that was error prone.Instead I created a new wrapper type that implements
io.Reader
but keeps context on whether this is the first read or not. If it is the first read it returns what the inner reader returns. Otherwise, if it isn't the first read and the inner reader returnsEOF
, this reader knows that we ran into anEOF
in the context of expecting to read more data from this inner reader, so the wrapper reader returnsErrUnexpectedEOF
instead ofEOF
.This does add overhead, so I'm curious to see if we notice it.
r? @Stebalien